Adapting RANSAC SVM to Detect Outliers for Robust Classification

Authors

  • Subhabrata Debnath
  • Anjan Banerjee
  • Vinay P. Namboodiri
Abstract

This paper addresses the problem of classifying objects when some of the labels in the training data are noisy. This is a common scenario, caused either by the difficulty of annotation or by inadvertent human error. We treat the wrongly annotated examples as outliers and aim to formulate a robust outlier identification algorithm. Learning a model in the presence of noise has traditionally been handled by the RANSAC algorithm [1], which Nishida and Kurita adapted as RANSAC SVM [3]. RANSAC SVM selects random subsets of the training data and trains small SVMs on them, using the rest of the training data as validation sets. It then chooses the SVM with the smallest validation error to approximate the full training set. However, if the training data is noisy, the validation sets are corrupted as well, and a faulty submodel may be chosen as optimal. To address this problem, we propose a modification to RANSAC SVM that makes it robust to label noise. The detailed algorithm is given in Algorithm 1. We are initially given a training set S of n examples, from which we draw small random subsets of size k. For each such subset, we train an SVM to obtain a weight vector w_i using the standard support vector formulation given by:
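The formulation itself is not reproduced in this abstract. The baseline RANSAC SVM selection step described above (random subsets of size k, one small SVM per subset, the remaining examples used for validation, smallest validation error wins) can still be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' Algorithm 1 and not their proposed modification: the linear SVM is trained with a Pegasos-style subgradient method chosen here for brevity, and every function name, parameter, and default value (`train_linear_svm`, `ransac_svm`, `make_toy_data`, `lam`, `epochs`, `n_trials`) is hypothetical.

```python
import random


def train_linear_svm(X, y, lam=0.01, epochs=50, seed=0):
    """Assumed trainer: Pegasos-style subgradient descent on the hinge loss.

    Labels must be +1 or -1. The returned weight vector has one extra
    trailing entry that acts as the bias term.
    """
    rng = random.Random(seed)
    w = [0.0] * (len(X[0]) + 1)
    order = list(range(len(X)))
    t = 0
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            t += 1
            eta = 1.0 / (lam * t)            # decaying step size
            xi = list(X[i]) + [1.0]          # append constant feature for the bias
            margin = y[i] * sum(a * b for a, b in zip(w, xi))
            w = [(1.0 - eta * lam) * a for a in w]   # shrink (regularisation)
            if margin < 1.0:                 # hinge loss is active for this example
                w = [a + eta * y[i] * b for a, b in zip(w, xi)]
    return w


def predict(w, x):
    score = sum(a * b for a, b in zip(w, list(x) + [1.0]))
    return 1 if score >= 0 else -1


def error_rate(w, X, y):
    wrong = sum(1 for xi, yi in zip(X, y) if predict(w, xi) != yi)
    return wrong / len(X)


def ransac_svm(X, y, k, n_trials, seed=0):
    """Train one small SVM per random subset; keep the one with the
    smallest error on the examples left out of that subset."""
    rng = random.Random(seed)
    n = len(X)
    best_w, best_err = None, float("inf")
    for trial in range(n_trials):
        chosen = rng.sample(range(n), k)
        chosen_set = set(chosen)
        rest = [i for i in range(n) if i not in chosen_set]
        w = train_linear_svm([X[i] for i in chosen],
                             [y[i] for i in chosen], seed=trial)
        err = error_rate(w, [X[i] for i in rest], [y[i] for i in rest])
        if err < best_err:
            best_w, best_err = w, err
    return best_w, best_err


def make_toy_data(n_per_class=40, seed=1):
    """Hypothetical toy data: two well-separated 2-D Gaussian clusters."""
    rng = random.Random(seed)
    X, y = [], []
    for label, centre in ((1, 2.0), (-1, -2.0)):
        for _ in range(n_per_class):
            X.append([rng.gauss(centre, 0.5), rng.gauss(centre, 0.5)])
            y.append(label)
    return X, y
```

Calling `ransac_svm(X, y, k=10, n_trials=5)` on the toy data returns the selected weight vector together with its validation error. Note that the validation error is computed against the given labels, so if some labels were flipped, a good submodel would be penalised for disagreeing with them; this is the weakness the abstract sets out to address.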


Similar resources

Debnath, Banerjee, Namboodiri: Adapting Ransac-svm to Detect Outliers for Robust Classification

Most visual classification tasks assume that the label information is authentic. However, for reasons such as the difficulty of annotation or inadvertent human error, the annotation can often be noisy. This results in wrongly annotated examples. In this paper, we consider the examples that are wrongly annotated to be outliers. The task of learning a robust inlier model in the ...


Robustified distance based fuzzy membership function for support vector machine classification

Fuzzification of the support vector machine has been used to deal with the problem of outliers and noise. This is achieved by means of a fuzzy membership function, which is generally built from the distance of the points to the class centroid. The focus of this research is twofold. Firstly, by taking advantage of robust statistics in the fuzzy SVM, more emphasis on reducing the im...


Outlier Detection for Support Vector Machine using Minimum Covariance Determinant Estimator

The purpose of this paper is to identify the points that affect the performance of one of the important algorithms of data mining, namely the support vector machine. The final classification decision is made on the basis of a small portion of the data called support vectors. The existence of atypical observations among these points will therefore result in deviation from the correct decision. Thus...


RANSAC-Based Training Data Selection for Speaker State Recognition

We present a Random Sampling Consensus (RANSAC) based training approach for the problem of speaker state recognition from spontaneous speech. Our system is trained and tested with the INTERSPEECH 2011 Speaker State Challenge corpora, which include the Intoxication and the Sleepiness Sub-challenges, where each sub-challenge defines a two-class classification task. We aim to perform a RANSAC-based ...


A Comparative Analysis of RANSAC Techniques Leading to Adaptive Real-Time Random Sample Consensus

The Random Sample Consensus (RANSAC) algorithm is a popular tool for robust estimation problems in computer vision, primarily due to its ability to tolerate a tremendous fraction of outliers. There have been a number of recent efforts that aim to increase the efficiency of the standard RANSAC algorithm. Relatively fewer efforts, however, have been directed towards formulating RANSAC in a manner...




Publication date: 2015